In the United States, tornadoes are a common natural disaster that can occur at any time of year. Using 2024 tornado data collected by the NCEI and NOAA (https://www.ncei.noaa.gov/pub/data/swdi/stormevents/csvfiles/), this analysis examines which areas are most at risk by looking at tornadoes that actually touched the ground and moved. It also examines whether time of year is a factor. The hypothesis is that the locations at risk of a tornado differ depending on the season.
import numpy as np
import pandas as pd
import geopandas as gpd
from geopandas import GeoSeries, GeoDataFrame
from shapely.geometry import Point
import mapclassify
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import cartopy.crs as ccrs # import projection
import cartopy.feature as cf # import features
import json
import folium
import mplleaflet
Keep only the columns needed for the analysis and drop rows with missing values.
df = pd.read_csv('Tornadoes2024.csv') #read in data
df = df.loc[:, ['EVENT_ID','STATE','MONTH_NAME','BEGIN_LOCATION','END_LOCATION','BEGIN_LAT','END_LAT','BEGIN_LON','END_LON']] #keep necessary variables for analysis
df = df.dropna() #remove rows with null values
df.head() #print first 5 lines
Entries where neither the latitude nor the longitude changes are treated here as "tornado possible weather": they may simply have been rotation in the clouds or spiraling cold and hot fronts. This analysis focuses on tornadoes that actually touched the ground and could cause damage.
df_move = df[(df.BEGIN_LAT - df.END_LAT != 0) | (df.BEGIN_LON - df.END_LON != 0)] #keep values where either the lon or lat changes
df_move.head() #print first 5 lines
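To make the filter concrete, here is a minimal sketch on a toy table. The coordinates and event IDs are made up for illustration and do not come from the 2024 file; `toy`, `moved`, and `toy_move` are hypothetical names.

```python
import pandas as pd

# Toy rows (made-up values, not taken from Tornadoes2024.csv)
toy = pd.DataFrame({
    'EVENT_ID':  [1, 2, 3],
    'BEGIN_LAT': [35.10, 36.20, 33.00],
    'END_LAT':   [35.10, 36.25, 33.00],
    'BEGIN_LON': [-97.50, -95.00, -88.40],
    'END_LON':   [-97.50, -95.00, -88.35],
})

# A row counts as "moved" when either coordinate differs between start and end
moved = (toy['BEGIN_LAT'] != toy['END_LAT']) | (toy['BEGIN_LON'] != toy['END_LON'])
toy_move = toy[moved]

# Event 1 never moved, event 2 changed latitude, event 3 changed longitude
moved_ids = toy_move['EVENT_ID'].tolist()
print(moved_ids)
```

Only the event whose start and end points are identical is dropped; a change in just one coordinate is enough to keep a row.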
Investigate what the data looks like without distinguishing between each season.
extant = [-140,-50,20, 45]
fig=plt.figure(figsize=(24,18))
ax = plt.axes(projection=ccrs.PlateCarree())
ax.set_extent(extant) #set the extent of the map
ax.gridlines() #add gridlines
ax.coastlines(resolution='50m') #add coastlines
ax.add_feature(cf.LAKES) #add lakes
ax.add_feature(cf.STATES) #add states
plt.scatter(df_move['BEGIN_LON'],df_move['BEGIN_LAT'])
Use 3-month periods to define each season: winter (December to February), spring (March to May), summer (June to August), and fall (September to November).
#create separate datasets for each season (masks are built from df_move so their indexes match)
df_winter = df_move[df_move.MONTH_NAME.isin(['December', 'January', 'February'])]
df_spring = df_move[df_move.MONTH_NAME.isin(['March', 'April', 'May'])]
df_summer = df_move[df_move.MONTH_NAME.isin(['June', 'July', 'August'])]
df_fall = df_move[df_move.MONTH_NAME.isin(['September', 'October', 'November'])]
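An alternative that may be easier to extend is to label each row with its season once and then group by that label. The sketch below uses a toy table; `SEASON_BY_MONTH` and the `SEASON` column are hypothetical names introduced here, and it assumes `MONTH_NAME` holds full, capitalized month names as in the filters above.

```python
import pandas as pd

# Month-to-season lookup using the same 3-month groupings as above
SEASON_BY_MONTH = {
    'December': 'winter', 'January': 'winter', 'February': 'winter',
    'March': 'spring', 'April': 'spring', 'May': 'spring',
    'June': 'summer', 'July': 'summer', 'August': 'summer',
    'September': 'fall', 'October': 'fall', 'November': 'fall',
}

# Toy rows (made-up values) standing in for df_move
toy = pd.DataFrame({
    'EVENT_ID': [1, 2, 3, 4],
    'MONTH_NAME': ['January', 'April', 'July', 'October'],
})

# Add a SEASON column (hypothetical name) by looking up each month
toy['SEASON'] = toy['MONTH_NAME'].map(SEASON_BY_MONTH)

# One DataFrame per season, keyed by season name
by_season = {name: group for name, group in toy.groupby('SEASON')}
print(toy)
```

A month name missing from the lookup would map to NaN, which makes unexpected values easy to spot.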
Plot a graph for each season to get an idea of how tornadoes for each individual season are spread across the continental United States.
extant = [-140,-50,20, 45] #define outer border of map
fig=plt.figure(figsize=(24,18)) #define display size of figure
ax = plt.axes(projection=ccrs.PlateCarree())
ax.set_extent(extant) #set the extent of the map
ax.gridlines() #add gridlines
ax.coastlines(resolution='50m') #add coastlines to the map
ax.add_feature(cf.LAKES) #add lakes
ax.add_feature(cf.STATES) #add states
plt.scatter(df_winter['BEGIN_LON'],df_winter['BEGIN_LAT']) #plot the information of df_winter
#process same as it was for winter
extant = [-140,-50,20, 45]
fig=plt.figure(figsize=(24,18))
ax = plt.axes(projection=ccrs.PlateCarree())
ax.set_extent(extant)
ax.gridlines()
ax.coastlines(resolution='50m')
ax.add_feature(cf.LAKES)
ax.add_feature(cf.STATES)
plt.scatter(df_spring['BEGIN_LON'],df_spring['BEGIN_LAT'])
#process same as it was for winter
extant = [-140,-50,20, 45]
fig=plt.figure(figsize=(24,18))
ax = plt.axes(projection=ccrs.PlateCarree())
ax.set_extent(extant)
ax.gridlines()
ax.coastlines(resolution='50m')
ax.add_feature(cf.LAKES)
ax.add_feature(cf.STATES)
plt.scatter(df_summer['BEGIN_LON'],df_summer['BEGIN_LAT'])
#process same as it was for winter
extant = [-140,-50,20, 45]
fig=plt.figure(figsize=(24,18))
ax = plt.axes(projection=ccrs.PlateCarree())
ax.set_extent(extant)
ax.gridlines()
ax.coastlines(resolution='50m')
ax.add_feature(cf.LAKES)
ax.add_feature(cf.STATES)
plt.scatter(df_fall['BEGIN_LON'],df_fall['BEGIN_LAT'])
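The map cells in this notebook repeat the same base-map setup. As a sketch only, that setup could be pulled into one helper that receives the axes and the data to draw. `draw_points_map`, `layers`, and `features` are hypothetical names, not part of cartopy; the helper only calls the axes methods already used above (`set_extent`, `gridlines`, `coastlines`, `add_feature`, `scatter`).

```python
def draw_points_map(ax, layers, extent=(-140, -50, 20, 45), features=()):
    """Draw the shared base map on `ax`, then one scatter layer per entry.

    ax       : a cartopy GeoAxes, e.g. plt.axes(projection=ccrs.PlateCarree())
    layers   : dict mapping a legend label to a (DataFrame, color) pair
    extent   : [lon_min, lon_max, lat_min, lat_max], same values as above
    features : cartopy features to add, e.g. (cf.LAKES, cf.STATES)
    """
    ax.set_extent(list(extent))      # outer border of the map
    ax.gridlines()                   # add gridlines
    ax.coastlines(resolution='50m')  # add coastlines
    for feature in features:         # add lakes, states, ...
        ax.add_feature(feature)
    # One scatter layer per season (or any other grouping)
    for label, (frame, color) in layers.items():
        ax.scatter(frame['BEGIN_LON'], frame['BEGIN_LAT'],
                   s=10, c=color, label=label)
    return ax
```

With the imports from the top of the notebook, a single-season map might then be drawn with `draw_points_map(plt.axes(projection=ccrs.PlateCarree()), {'fall': (df_fall, 'k')}, features=(cf.LAKES, cf.STATES))`. Passing the axes in keeps figure creation and sizing under the caller's control.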
Plot all four seasons on a single map, with winter in blue, spring in red, summer in green, and fall in black. This helps show whether there is a visible difference between seasons that cannot be seen on the individual maps.
#process same as it was for winter
extant = [-140,-50,20, 45]
fig=plt.figure(figsize=(24,18))
ax = plt.axes(projection=ccrs.PlateCarree())
ax.set_extent(extant)
ax.gridlines()
ax.coastlines(resolution='50m')
ax.add_feature(cf.LAKES)
ax.add_feature(cf.STATES)
#add individual scatterplot onto the map, set x and y values, the color, shape of marker, and labels the plot
ax.scatter(x = df_winter['BEGIN_LON'],y = df_winter['BEGIN_LAT'], s=10, c='b', marker="s", label='winter')
ax.scatter(x = df_spring['BEGIN_LON'],y = df_spring['BEGIN_LAT'], s=10, c='r', marker="s", label='spring')
ax.scatter(x = df_summer['BEGIN_LON'],y = df_summer['BEGIN_LAT'], s=10, c='g', marker="s", label='summer')
ax.scatter(x = df_fall['BEGIN_LON'],y = df_fall['BEGIN_LAT'], s=10, c='k', marker="s", label='fall')
plt.legend(loc='lower left', fontsize=20) #create a legend for the visualization
After looking at all 2024 tornadoes that touched ground, there does seem to be a difference in which areas are at risk depending on the season. In the winter, tornadoes generally touch down in relatively warmer climates, primarily along the coastlines of the continental United States. In the spring, the risk moves inland: most of the central states, especially the Midwestern ones, are at an elevated risk. In the summer, the risk becomes more spread out, with tornadoes appearing in fairly large numbers in the Great Plains as well as in the Midwest and on the East Coast. Finally, in the fall, the risk seems to move south toward warmer climates again; only the southern Midwestern states, states with historically warmer climates, and southern coastal states are at an elevated risk. The combined visualization makes this clear, as there seem to be fairly distinct clusters of similarly colored points across the United States. Potential further analysis includes looking at year-over-year data to see whether this pattern is a one-off result of 2024 or a consistent trend, and looking into how specific geographic features, such as lakes or mountains, affect tornado occurrences.
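One way to put numbers behind the visual impression would be to count events per state and season. The sketch below uses made-up rows, not the 2024 data, and assumes a hypothetical `SEASON` column holding each event's season.

```python
import pandas as pd

# Toy rows (made-up values); SEASON is an assumed column, one label per event
toy = pd.DataFrame({
    'STATE':  ['FLORIDA', 'FLORIDA', 'IOWA', 'IOWA', 'IOWA', 'KANSAS'],
    'SEASON': ['winter', 'fall', 'spring', 'spring', 'summer', 'summer'],
})

# Rows are states, columns are seasons, cells are event counts
counts = pd.crosstab(toy['STATE'], toy['SEASON'])
print(counts)
```

State and season combinations with no events show up as 0, so seasonal gaps are visible in the same table as the peaks.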